Updating the JIT to take EnableSSE3_4 into account when setting the supported instruction sets #16395
src/jit/compiler.cpp
Outdated
@@ -2716,7 +2710,7 @@ void Compiler::compSetProcessor()
codeGen->getEmitter()->SetContainsAVX(false);
codeGen->getEmitter()->SetContains256bitAVX(false);
}
-else if (CanUseSSE4())
+else if (compSupports(InstructionSet_SSE41) || compSupports(InstructionSet_SSE42))
We need to support emitting the 4-byte encoding for both SSE4.1 and SSE4.2 instructions, even if only one of them is enabled.
I think the original behavior is compSupports(InstructionSet_SSE41) && compSupports(InstructionSet_SSE42)
The original behavior will be broken for the case where EnableSSE42=1 and EnableSSE41=0, which is completely valid to set today
Oh, I see. This is setting the bit in emitter.
Updated to also include `InstructionSet_SSSE3`, which also uses the `0F 38` and `0F 3A` encoding.
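A minimal sketch of the condition this thread converges on, assuming instruction sets are tracked as bit flags. The `IsaBit` enum and `needsFourByteEncoding` are hypothetical names used only for illustration, not the JIT's actual types. The point is that the emitter flag depends on whether any ISA using the `0F 38` / `0F 3A` encoding is enabled, so `EnableSSE42=1` with `EnableSSE41=0` still gets the longer encoding:

```cpp
#include <cstdint>

// Hypothetical bit flags standing in for the JIT's instruction-set identifiers.
enum IsaBit : std::uint32_t
{
    ISA_SSE3  = 1u << 0,
    ISA_SSSE3 = 1u << 1,
    ISA_SSE41 = 1u << 2,
    ISA_SSE42 = 1u << 3,
};

// Returns true when at least one enabled ISA is among those the thread names
// as using the 0F 38 / 0F 3A encoding (SSSE3, SSE4.1, SSE4.2).
constexpr bool needsFourByteEncoding(std::uint32_t enabled)
{
    return (enabled & (ISA_SSSE3 | ISA_SSE41 | ISA_SSE42)) != 0;
}
```

SSE3 is left out of the mask because the thread only names SSSE3, SSE4.1, and SSE4.2 as using this encoding.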
src/jit/compiler.h
Outdated
@@ -7462,7 +7462,7 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | |||
return SIMD_AVX2_Supported; | |||
} | |||
|
|||
if (CanUseSSE4()) | |||
if (compSupports(InstructionSet_SSE41)) |
SIMD only emits SSE4.1 instructions
Not really, the legacy SIMD code can generate `pcmpgtq`, which is an SSE4.2 instruction.
Will update. I forgot that instruction was 4.2
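To make the consequence concrete, here is a small sketch of a SIMD-level check that takes SSE4.2 into account. The `SimdLevel` enum, the `IsaBit` flags, and `pickSimdLevel` are hypothetical stand-ins; the requirement for all four ISAs follows the code comment added later in this PR ("SIMD_SSE4_Supported actually requires all of SSE3, SSSE3, SSE4.1, and SSE4.2"):

```cpp
#include <cstdint>

// Hypothetical bit flags standing in for the JIT's instruction-set identifiers.
enum IsaBit : std::uint32_t
{
    ISA_SSE3  = 1u << 0,
    ISA_SSSE3 = 1u << 1,
    ISA_SSE41 = 1u << 2,
    ISA_SSE42 = 1u << 3,
};

// Hypothetical SIMD levels; the real JIT has more (for example an AVX2 level).
enum class SimdLevel
{
    SSE2,
    SSE4,
};

// The SSE4 level is chosen only if every ISA the SIMD code may emit is enabled.
// SSE4.2 is part of the mask because the legacy SIMD code can generate pcmpgtq.
constexpr SimdLevel pickSimdLevel(std::uint32_t enabled)
{
    return ((enabled & (ISA_SSE3 | ISA_SSSE3 | ISA_SSE41 | ISA_SSE42)) ==
            (ISA_SSE3 | ISA_SSSE3 | ISA_SSE41 | ISA_SSE42))
               ? SimdLevel::SSE4
               : SimdLevel::SSE2;
}
```

With only SSE4.1 enabled, this sketch falls back to the SSE2 level, which is the behavior the review comment asks for.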
src/jit/compiler.cpp
Outdated
@@ -2666,47 +2666,41 @@ void Compiler::compSetProcessor()
opts.setSupportedISA(InstructionSet_POPCNT);
}
}
if (jitFlags.IsSet(JitFlags::JIT_FLAG_USE_SSE3))

if (JitConfig.EnableSSE3_4())
We only mark SSE3, SSSE3, SSE4.1, and SSE4.2 as enabled if both the EnableSSE3_4 flag and the individual ISA flag are set.
👍
Updated and added explicit comments.
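The two-level gating described above can be sketched as a pure function, assuming the group switch is a boolean and the individual ISA flags are a bitmask. All names here (`IsaBit`, `kSse3Through42`, `enabledSseIsas`) are hypothetical and only illustrate the rule; they are not the JIT's API:

```cpp
#include <cstdint>

// Hypothetical bit flags standing in for the JIT's instruction-set identifiers.
enum IsaBit : std::uint32_t
{
    ISA_SSE3  = 1u << 0,
    ISA_SSSE3 = 1u << 1,
    ISA_SSE41 = 1u << 2,
    ISA_SSE42 = 1u << 3,
};

// The ISAs governed by the group switch.
constexpr std::uint32_t kSse3Through42 = ISA_SSE3 | ISA_SSSE3 | ISA_SSE41 | ISA_SSE42;

// groupEnabled:  stand-in for the EnableSSE3_4 switch.
// perIsaEnabled: bitmask of ISAs whose individual flag is set.
// An ISA is reported as enabled only when both levels agree.
constexpr std::uint32_t enabledSseIsas(bool groupEnabled, std::uint32_t perIsaEnabled)
{
    return groupEnabled ? (perIsaEnabled & kSse3Through42) : 0u;
}
```

Turning the group switch off disables all four ISAs regardless of the individual flags, while leaving it on still lets a single ISA (for example SSE4.2 without SSE4.1) be enabled on its own.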
src/jit/compiler.cpp
Outdated
@@ -2716,8 +2713,11 @@ void Compiler::compSetProcessor()
codeGen->getEmitter()->SetContainsAVX(false);
codeGen->getEmitter()->SetContains256bitAVX(false);
}
-else if (CanUseSSE4())
+else if (compSupports(InstructionSet_SSE41) || compSupports(InstructionSet_SSE42))
We could create a `compSupportsAny(InstructionSet_SSE41 | InstructionSet_SSE42)` function to simplify this, if we think it is worthwhile.
-if (CanUseSSE4())
+// SIMD_SSE4_Supported actually requires all of SSE3, SSSE3, SSE4.1, and SSE4.2
+// to be supported. We can only enable it if all four are enabled in the compiler
+if (compSupports(InstructionSet_SSE42) && compSupports(InstructionSet_SSE41) &&
We could create a `compSupportsAll(InstructionSet_SSE3 | InstructionSet_SSSE3 | InstructionSet_SSE41 | InstructionSet_SSE42)` function to simplify this, if we think it is worthwhile.
CC. @dotnet/jit-contrib
LGTM
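For reference, the two helpers floated in this review could look roughly like the sketch below. It assumes the instruction-set identifiers are bit flags that can be OR-ed together, which the thread does not confirm; if the real `InstructionSet_*` values are plain ordinals, each would first need to be turned into a mask. The `IsaBit` enum and the free functions `supportsAny` / `supportsAll` are hypothetical:

```cpp
#include <cstdint>

// Hypothetical bit flags standing in for the JIT's instruction-set identifiers.
enum IsaBit : std::uint32_t
{
    ISA_SSE3  = 1u << 0,
    ISA_SSSE3 = 1u << 1,
    ISA_SSE41 = 1u << 2,
    ISA_SSE42 = 1u << 3,
};

// True if at least one of the requested ISAs is enabled.
constexpr bool supportsAny(std::uint32_t enabled, std::uint32_t wanted)
{
    return (enabled & wanted) != 0;
}

// True only if every requested ISA is enabled.
constexpr bool supportsAll(std::uint32_t enabled, std::uint32_t wanted)
{
    return (enabled & wanted) == wanted;
}
```

Under these assumptions, the emitter check in this PR maps onto `supportsAny` and the SIMD-level check onto `supportsAll`, which is the simplification the comments suggest.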
{
// Emitter::UseSSE4 controls whether we support the 4-byte encoding for certain
// instructions. We need to check if either is supported independently, since
// it is currently possible to enable/disable them separately.
codeGen->getEmitter()->SetUseSSE4(true);
We might want to consider changing the name of this flag, as this is a bit confusing. But I don't think it's critical.
I agree. I'll log a bug to track it.
…upported instruction sets
Applied formatting patch.
test Windows_NT x64 Checked jitsse2only
test Windows_NT x64 Checked jitincompletehwintrinsic
test Windows_NT x86 Checked jitincompletehwintrinsic
test Ubuntu x64 Checked jitincompletehwintrinsic
test OSX10.12 x64 Checked jitincompletehwintrinsic
Resolves the error brought up here: #16378 (comment)
This updates the `setSupportedISA` calls to take `JitConfig.EnableSSE3_4` into account. This also removes the `compCanUseSSE4` field in favor of more explicit `compSupports` calls.
FYI. @fiigii, @CarolEidt, @AndyAyersMS